Remote Store Programming: A Memory Model for Embedded Multicore

Authors

  • Henry Hoffmann
  • David Wentzlaff
  • Anant Agarwal
Abstract

This paper presents remote store programming (RSP), a programming paradigm which combines usability and efficiency through the exploitation of a simple hardware mechanism, the remote store, which can easily be added to existing multicores. The RSP model and its hardware implementation trade a relatively high store latency for a low load latency because loads are more common than stores, and it is easier to tolerate store latency than load latency. This paper demonstrates the performance advantages of remote store programming by comparing it to cache-coherent shared memory (CCSM) for several important embedded benchmarks using the TILEPro64 processor. RSP is shown to be faster than CCSM for all eight benchmarks using 64 cores. For five of the eight benchmarks, RSP is shown to be more than 1.5× faster than CCSM. For a 2D FFT implemented on 64 cores, RSP is over 3× faster than CCSM. RSP’s features, performance, and hardware simplicity make it well suited to the embedded processing domain.

Similar resources

Remote Store Programming: Reflective Memory for Multicore

This work presents remote store programming (RSP), an instance of the reflective memory model designed to be incrementally supportable on multicores that support loads and stores. To demonstrate the value of RSP, its performance is compared to that of both shared and distributed memory approaches using the TILEPro64 multicore processor. RSP is shown to be as much as 1.76× faster than distribute...

Remote Store Programming: Mechanisms and Performance

This paper presents remote store programming (RSP). This paradigm combines usability and efficiency through the exploitation of a simple hardware mechanism, the remote store, which can easily be added to existing multicores. Remote store programs are marked by fine-grained and one-sided communication which results in a stream of data flowing from the registers of a sending process to the cache ...

Multicore Systems – Challenges for the Real-Time Software

Multicore systems have become the norm for desktop computer systems. The percentage of multicore systems in the embedded domain is still marginal, but growing at an incredible pace such that multicore will become the norm in the embedded area as well. However, embedded systems have additional requirements with respect to safety, reliability, and real-time behaviour. The use of parallel multicor...

IP-ESC ’11 Co-Designed Cache Coherency Architecture for Embedded Multicore Systems

One of the key challenges in chip multi-processing is to provide a programming model that manages cache coherency in a transparent and efficient way. A large number of applications designed for embedded systems are known to read and write data following memory access patterns. Memory access patterns can be used to optimize cache consistency by prefetching data and reducing the number of memory ...

Impact of Thread Synchronization and Data Parallelism on Multicore Game Programming

Xbox-360 has three cores with six logical threads and the PlayStation-3 has one master core and six independent worker cores. According to the current design trends, multicore processors will be ubiquitous in every game computer. A game engine has many ‘components’ and multithreading is an important technique to parallelize the execution of these components. However, effective programming of mu...

Publication date: 2009